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ABSTRACT 

Measuring radio source counts is critical for characterizing new extragalactic populations, 
brings a wealth of science within reach and will inform forecasts for SKA and its pathfinders. 
Yet there is currently great debate (and few measurements) about the behaviour of the 1.4- 
GHz counts in the /iJy regime. One way to push the counts to these levels is via ‘stacking’, the 
covariance of a map with a catalogue at higher resolution and (often) a different wavelength. 
For the first time, we cast stacking in a fully bayesian framework, applying it to (i) the SKADS 
simulation and (ii) VLA data stacked at the positions of sources from the VIDEO survey. In 
the former case, the algorithm recovers the counts correctly when applied to the catalogue, 
but is biased high when confusion comes into play. This needs to be accounted for in the 
analysis of data from any relatively-low-resolution SKA pathfinders. For the latter case, the 
observed radio source counts remain flat below the 5-er level of 85 /iJy as far as 40 /iJy, then 
fall off earlier than the flux hinted at by the SKADS simulations and a recent P{D ) analysis 
(which is the only other measurement from the literature at these flux-density levels, itself 
extrapolated in frequency). Division into galaxy type via spectral-energy distribution reveals 
that normal spiral galaxies dominate the counts at these fluxes. 

Key words: methods: data analysis - methods: statistical - surveys - galaxies: evolution - 
radio continuum: galaxies - radio continuum: general 


1 INTRODUCTION 

Measurements of radio source counts provided some of the earli¬ 
est cosmological support for an expanding Universe, but today they 
can be used to discover and characterize new extragalactic popula¬ 
tions (see e.g. Massardi et al. 2011; Mahony et al. 2011; Whittam 
et al. 2013; Franzen et al. 2014) and/or to shed light on galaxy evo¬ 
lution in allowing (together with redshift information) inference of 
star-formation rates and luminosity functions (see e.g. Dunne et al. 
2009; Karim et al. 2011; Roseboom and Best 2014; Zwart et al. 
2014a). It is also critical to pin them down in advance of the new 
wave of radio telescopes such as MeerKAT, ASKAP and SKAl: 
since confusion noise (Scheuer 1957; Condon 1974, 1992; Con¬ 
don et al. 2002, 2012) depends directly on the source counts and a 
telescope’s synthesized beam, the level of confusion noise is both 
a function of these two and informs the desired telescope config¬ 
uration. It is therefore essential to measure counts for both tele¬ 
scope design and survey forecasting for SKA and its pathfinders 
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(see e.g. Zwart et al. 2014b). See also de Zotti et al. (2010) for a 
review of the counts at a range of radio frequencies. 

There is currently considerable debate about the behaviour of 
1.4-GHz differential source counts at < /zjy levels, ARCADE2 
(Seiffert et al. 2011; Fixsen et al. 2011) finding them to be rela¬ 
tively flat (in the Euclidean normalization, i.e. flux density S 2 ' 5 ) 
below lOOpJy, but with VLA 3-GHz measurements by Condon 
et al. (2012) hinting that they are rather steeper (roughly ex S’ -1 ' 5 
from that work) at those flux-density levels. The work of Owen 
and Morrison (2008) showed the need for careful corrections for 
source angular diameters when assembling 1.4-GHz source counts. 
Heywood et al. (2013) suggested the considerable variation in the 
source counts at 100/rJy could be attributed to sample variance. 
Indeed, a more-detailed analysis (see below) of the Condon et al. 
data by Vernstrom et al. (2014) confirmed the steep counts down to 
sub-pjy levels; their result is still compatible with the Seiffert et al. 
data only with a new, somewhat extreme population of very faint 
sources. Finally, the state-of-the-art SKADS simulations (Wilman 
et al. 2008) also require the source counts to be steep (and steepen¬ 
ing) at the /xjy level. With all this in mind, there is a clear need for 
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more and complementary measurements in this area, especially in 
the run up to SKA1 and as its pathfinders come online. 

Three main methods exist for determining source counts. In 
the first, traditionally astronomers simply counted directly-detected 
sources in flux bins above some well-motivated threshold (usually 
5 x the rms noise for the radio-source catalogue, but e.g. AMI Con¬ 
sortium et al. 2011b even used 4.62x rms). 

Second, Scheuer (1957) showed that faint, unresolved sources 
buried within the thermal noise could contribute a variance 
term from their (positive) flux as well as from their (negative) 
synthesized-beam sidelobes. Hence (see also above) measuring the 
noise allows one to constrain source counts (a so-called probabil¬ 
ity of deflection, or P(D), analysis) if the confusion contribution 
is dominant. Hewish (1961) was the first to use Scheuer’s blind 
method, and it was most recently implemented (in the image plane) 
at 3 GHz by Condon et al. (2012) and Vernstrom et al. (2014), then 
extrapolated to 1.4 GHz. 

The third approach is ‘stacking’. One would like to push 
source-count measurements as deep as possible, but a P(D) anal¬ 
ysis is not possible unless a radio map is confusion-noise-limited. 
However, if one knew in advance the positions of sources below 
the formal detection threshold, for example from some deeper aux¬ 
iliary catalogue, one could constrain source counts below the de¬ 
tection threshold using this prior information and accounting for 
that thermal noise. Such a selection effect is a both a blessing and 
a curse: stacking allows for a robust ‘source-frame’ definition and 
subsequent subdivision of the sample, but cannot ever capture prop¬ 
erties of a population defined by an ‘observer-frame’ blind analysis 
such as P(D). The method has been applied in a number of dif¬ 
ferent settings, e.g. in polarization (Stil et al. 2014), in the sub-mm 
(Patanchon et al. 2009) and at 1.4 GHz (Seymour et al. 2008; Hales 
et al. 2014a,b). 

Marsden et al. (2009) formally define stacking as taking the 
covariance of a map with a catalogue (at any two wavebands), but 
in fact it has many flavours in the literature as pointed out by Zwart 
et al. (2014a). In the past (Dunne et al. 2009; Karim et al. 2011; 
Zwart et al. 2014a) some single statistic (usually an average such 
as the mean or median) has been used to summarize the properties 
of radio pixels stacked at the positions from the auxiliary catalogue, 
allowing calculation of, for example, star-formation rates in rnass- 
redshift bins. The precision on any inferred parameters scales as 
Agources (Roseboom and Best 2014), so for (say) 40,000 sources 
a statistical factor of 200 in sensitivity below the survey threshold 
could in theory be achieved. 

However, the data afford more information than what is en¬ 
coded in a single summary statistic (and careful attention must be 
paid to potential biases in those analyses; see Zwart et al. 2014b). 
For example. Bourne et al. (2012) identify three potential sources 
of bias in median stacking: the flux limits and shape of the ‘true’ 
underlying flux distribution, and the amplitude of the thermal noise. 
Rather than being viewed as biases, such extra information in the 
observed data can be exploited in order to infer the parameters of 
some underlying physical model. 

This was demonstrated by Mitchell-Wynne et al. (2014) who 
constrained 1.4-GHz source counts down to ct/ 10 from FIRST 
maps (Becker et al. 1995) stacked, at positions from deeper COS¬ 
MOS data (Scoville et al. 2007), by fitting a power-law model to the 
noisy map-extracted flux distribution. Using a similar parametric 
method, Roseboom and Best (2014) included redshift information 
in order to measure the evolution of the 1.4-GHz luminosity func¬ 
tion for near-infrared-selected galaxies, which they called ‘para¬ 
metric stack-fitting’. 


In this work we extend the algorithm of Mitchell-Wynne et al. 
(2014) in the following ways: (i) We cast the problem in a fully 
bayesian framework rather than in a purely maximum-likelihood 
one; (ii) in our framework we consider the use of the bayesian 
evidence for selecting between different source-count models; we 
demonstrate the algorithm on the SKADS simulations; and (iv) we 
apply the algorithm to a deep ( K s < 23.5) mass-selected data 
sample from the VIDEO survey (Jarvis et al. 2013). 

In section 2 we describe our bayesian framework, including 
the models, priors and likelihoods we use. We give some details of 
simulations undertaken in section 3, with a description of the near- 
infrared and radio observations and data in section 4. We present 
our results and discussion in section 5 before concluding in sec¬ 
tion 6. 

We assume radio spectral indices a are such that Sj, oc v a 
for a source of flux density S v at frequency v. All coordinates 
are epoch J2000. Magnitudes are in the AB system. We assume 
a ACDM ‘concordance’ cosmology throughout, with f2 m = 0.3, 
f) A = 0.7 and Ho = 70 krns -1 Mpc -1 (Komatsu et al. 2011). 


2 BAYESIAN FRAMEWORK 
2.1 Bayes’ Theorem 

Bayesian analyses of astronomical data sets have become com¬ 
monplace in the last ten years (see e.g. Feroz et al. 2009b, Feroz 
et al. 2011, Zwart et al. 2011, Lochner et al. 2015) because of their 
many advantages. The targeted posterior probability distribution 
V (©|D, H) of the values of the parameters ©, given the avail¬ 
able data D and a model H (the model is a hypothesis plus any 
assumptions), comes from Bayes’ theorem: 


P{&\D,H) 


£(D|0,7T)77(©|H) 

Z(D\H) 


( 1 ) 


In the numerator, the likelihood £(Dj©,77), i.e. the probabil¬ 
ity distribution of the data given parameter values and a model, 
encapsulates the experimental constraints. The prior 77(©|7T) 
records prior knowledge of or prejudices about the values of the 
parameters. The bayesian evidence, Z (DjiT), is the integral of 
£ (D|©, H) n (©| H) over all ©. Naively it normalizes the pos¬ 
terior in parameter space; crucially it facilitates selection between 
different models when their evidences are compared quantitatively 
(Occam's razor; see e.g. MacKay 2003). 


2.2 Sampling 

Carrying out the evidence integrations and sampling the parameter 
space has not in the past been easy and was often slow. Nested sam¬ 
pling (Skilling 2004) was invented specifically to reduce the com¬ 
putational cost of evidence calculations, since no posterior sam¬ 
ple (each of which is obtained ‘for free’) is ever wasted. How¬ 
ever, the integrations are exponential in the number of model pa¬ 
rameters, typically limiting that number to O(10 2 ) on current ma¬ 
chines. A particularly robust and efficient implementation of nested 
sampling is MultiNest (Feroz and Hobson 2008; Feroz et al. 
2009a; Buchner et al. 2014), which permits sampling from poste¬ 
riors that are multimodal and/or unusually shaped. Posterior distri¬ 
butions are represented by full distributions rather than a summary 
mean/median value and a (perhaps covariant) uncertainty, since this 
represents the total inference about the problem at hand, and error 
propagation is fully automatic. 
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In what follows we used a python interface (Buchner et al. 
2014) to MultiNest, deployed in parallel on (typically) 96 pro¬ 
cessors. Because of the unusually-shaped posterior distributions, 
there were often as many as 0 ( 100 , 000 ) likelihood evaluations. 
However, the resources required for the algorithm are not heavy 
on either memory or disk space: it is the number of parallel pro¬ 
cessors that controls the (rejection) sampling efficiency and hence 
total wall execution time. 


2.3 Models considered here 

We treat four source-count models here, each encoding some func¬ 
tional form with a number of parameters and incorporating priors 
on each of those parameters, plus any assumptions. The evidence 
will be employed to select between several piecewise-linearly- 
interpolated power-law models (A, B, C and D) by way of example, 
but any parametric model could be used be it a modified power law, 
a polynomial or a pole/node-based model (see e.g. Bridges et al. 
2009, Vemstrom et al. 2014). 

2.3.1 Model A 

For the single-power law, we have simply that 


Table 1. Priors 17 (©|77) adopted here. 


Parameter 

Prior 

C/sr — 1 Jy —1 

log-uniform £ [10 -5 ,10 7 ] 

a , /3, 7 , 5 

uniform E [—2.5,—0.1] 

Smin / ft] y 

uniform E [0.01, 20.0] 

Smax/ (4^ y 

uniform E [20.0,100.0] 

So, 1,2 

Uniform E \Smini Smax\ 

So, 1,2 

further require So < Si < S 2 

°Vi 

& (^survey)) 


uniform e [0.5, 2.0] o- surV ey 


2.4 Priors 

It is straightforward to vary the priors 17 (0jT7) on parameters 0; 
the ones we adopt are shown in Table 1. In particular the power-law 
normalization C, as a scale parameter, is given a logarithmic prior, 
and any power-law breaks are free to vary their positions. 

We assume all model hypotheses to be equally likely ab ini¬ 
tio, which is equivalent to considering solely the Bayes factor 
Z\ (D|17i) lZ /2 (DITT 2 ) when comparing models. 


dIV 

dS 


(67, a, S , 


min 5 rj-max ) 



Smin < s < Smax 

otherwise 


( 2 ) 


where ^ is the differential count for N sources in the flux interval 
[S, S + els'] (Jy). C is a normalization constant (sr - 1 Jy -1 ), a is 
a slope, and the model is set to zero outside the lower and upper 
limits Smin and S ma x- We denote the parameter vector ©a = 

{67, OL, Smin, Smax} • 


2.3.2 Models B, C and D 

Model B extends Model A with a second power law (two extra 
parameters), i.e. it incorporates a second slope /3 with a break at So 

(S min < So < Smaxb 


Smin < S < So 
So < S < Smax , (3) 

otherwise 

with a parameter vector ©b = {67, a, /3, So, Smin, S max }. Model 
C incorporates another break in the power law, so ©c = 
{67, a, (3, 7 , So, Si, S m in, Smax}, and Model D, a third break, 
i.e. © D = {67, a, /3, 7 , 5, So, Si, S 2 , S min , S max }. Priors on the 
different model parameters are discussed in section 2.4. 


diV 

(67, OL, /l, So, Smin , Smax') 

( 67S“ 

= ] CS“- p S p 
0 


2.3.3 Models A', B', C 1 and D 1 

In section 5 we investigate some scenarios where we allow the flux- 
measurement noise o n (see section 2.5) to be a free parameter of 
the fit. We denote these models A', B', C' and D' corresponding to 
their fixed-noise counterparts A. B, C and D respectively. 


2.5 Likelihood Function 

In what follows we drop an explicit dependence on P but do note 
our assumptions along the way. We cannot use a simple gaussian 
(y 2 ) likelihood function because the measurement uncertainties are 
not themselves gaussian: since we are dealing with binned data we 
adopt a poisson likelihood. For the i th bin containing fc, objects, 
the corresponding likelihood Ci (fc;| 0 ) is 

c l {k l \e) = 1 ^^, ( 4 ) 

where 



diV(S) fSm i+ AS m , 



( S-Sm ) 2 

2cr n 


(5) 

The second integral accounts for the gaussian map noise. U is, in 
a sense, a mock realisation of the count in a particular bin from a 
specific set of parameter values, to be compared with the real values 
encoded by k. Folded into U are the flux-measurement noise a n 
(assumed here to be the same for all objects, i.e. constant across the 
field, Oj = cr n Vj sources) and the bin limits [S m; , S mi + AS m J. 
S and Sm are respectively the measured and noise-free fluxes for 
the sources in a given bin, related by S ~ JV (S m , a 2 ). Carrying 
out the second integration in equation 5 gives 



d S 


dN(S) 
d S 


erf 


S -S m , 


— erf 


S-(S m ,+AS m .) 

etn 


(6) 


Assuming independent bin entries, the likelihoods multiply (the 
log-likelihoods adding) to give the total likelihood for the ribins 
bins: 
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n bins 

£(k|0) = Yl £,(fc,:|0). (7) 

2 = 1 

When combined with the priors from Table 1, inserting equation 7 
into equation 1 yields the posterior probability distribution (for pa¬ 
rameter estimation) and the bayesian evidence (for model selec¬ 
tion). 

As a final point, it is possible to jointly infer an assumed- 
constant measurement noise, including any contribution from con¬ 
fusion noise, at the same time as the source-count model param¬ 
eters. Hence the source-count model could simultaneously be ex¬ 
tended to even fainter fluxes, as well as to the inclusion of brighter 
sources above the detection threshold. 


With the formalism for the priors and likelihoods in place, we have 
all of the ingredients we need to calculate posterior probability dis¬ 
tributions and the evidences for each model. We now turn our at¬ 
tention to testing the algorithm on simulations. 


3 SIMULATIONS 

The Square Kilometre Array Design Studies SKA Simulated Skies 
(SKADS-S 3 ) simulation (Wilman et al. 2008, 2010) is a semi- 
empirical model of the extragalactic radio-continuum sky covering 
an area of 400 deg 3 , from which we extracted a 1-deg 2 catalogue at 
1.4 GHz for the purposes of testing our method. The simulation, 
which is the most recent available, incorporates both large- and 
small-scale clustering and has a flux limit of lOnJy. We undertook 
several tests as follows. 

First, we took all « 375,000 of the noise-free 1.4-GHz fluxes 
(< 85/rJy, i.e. the 5 -a limit of our radio map — see section 4.2) 
and added gaussian random noise of standard deviation equal to 
the modal VLA map 1 -a noise (i.e. 16.2 /.Jy). We divided the re¬ 
sulting mock catalogue into 38 bins, —68 < S/p Jy < 85, with the 
lower edge of the lowest bin being the flux of the faintest source in 
the noisy catalogue. Fitting each of models A, B, C and D in turn, 
we found that model B (one break) returned the highest evidence 
(Table 2). Figure 1 is the posterior probability distribution of the 
parameters for that model. 

A bayesian framework allows us to explore the full poste¬ 
rior, rather than simply determining the position of its peak, even 
though the distribution is highly non-gaussian. For example, there 
is a strong degeneracy between C and a; there is a hard diago¬ 
nal edge in the positions of adjacent power-law breaks because of 
the constraint that e.g. Sq < S\ < S 2 ', alternate power laws tend 
to be anti-correlated in their parameters; and overall there is little 
symmetry in the posterior. A fully bayesian analysis such as ours 
is essential because so many of the assumptions of a maximum- 
likelihood approach have broken down. The bayesian evidence is 
in turn the appropriate tool to be used for model selection in place 
of a A\ 2 — an estimator that would not have told the whole story. 

We have deliberately not tabulated maximum a posteriori 
(MAP), maximum-likelihood or ‘best-fit’ parameter values because 
even the 1-D marginalized posteriors are misleading in their mask¬ 
ing of the intricate characteristics of the full posterior probability 
distribution. Point estimates such as the maximum-likelihood and 
MAP are subject to noise and hence are not very representative 
of the full posterior. Rather, we reconstruct source counts directly 
from the ensemble of posterior samples, as follows. Figure 2 shows 


Table 2. Evidences A Z for models A-D applied to the full (1-deg 2 ) 
SKADS catalogue, relative to model A. The preferred model is B. 


Model 

A log e 2 

Odds ratio Z /Za 

A 

0.0 

1 

B 

0.87 ± 0.20 

2.39:1 

C 

0.83 ±0.21 

2.29:1 

D 

0.34 ±0.21 

1.40:1 



Figure 2. Source counts reconstructed from the SKADS mock catalogue. 
The underlying, noise-free realisation of the counts for the 1 -deg 2 catalogue 
is indicated by the green dashed line. The real, noisy catalogue data are 
shown as a black dashed line. The black line and grey shaded region show 
the maximum a posteriori (MAP) and 68-per-cent confidence region of the 
distributions of models reconstructed from every sample from the posterior 
in Figure 1. The vertical blue lines are at I rj and 5cr. 

the source-count reconstruction based on (the winning) model B. 
For each sample from the posterior, we have generated a corre¬ 
sponding source-count reconstruction evaluated at each flux-bin 
centre; the uncertainties are defined as the 68-per-cent confidence 
region around the median. We have recentred those uncertainties on 
the MAP reconstruction. Note that the uncertainties are underesti¬ 
mates because they do not account for correlations in data space: 
for example, high signal-to-noise data (e.g. at the bright end) will 
generally drive the uncertainties in the power law (which is fitted 
over the full flux range, making the bins not independent), but the 
error bars inflate at the faint end where the signal-to-noise is low 
(see below). 

The reconstructed count is consistent with the underlying 
SKADS count to below 1 pjy and plausibly as low as 0.3 piy (given 
also the input realisation of that underlying count; green dashed 
line). The counts are overestimated at the lowest fluxes, where we 
note that there is little signal-to-noise. In fact the algorithm ex¬ 
hibits 1/s/N behaviour: for N = 375,000, one might hope to 
reach a factor of « s/N below the 5-<r survey threshold of 85/rJy, 
i.e. 0.1 pty, which is indeed roughly what we see. 

To place further constraints on the depth the algorithm can 
reach, we fixed the position of the faintest break in model C and 
investigated the effects on the reconstructed counts. Figure 3 shows 
what happens as the break is positioned at 10, 5, 1 and 0.5 pjy. The 
reconstruction is accurate until it breaks down at «0.5-l/rJy, in 
the sense that the uncertainties suddenly inflate when the enforced 
break position is decreased to that flux. Note that this is consistent 
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Figure 1. Posterior probability distribution for the (winning) single-break model B fitted to the full SKADS mock catalogue data. The 68 and 95 per cent 
confidence limits are respectively indicated by the dark and light shaded regions. 


with our assertion above that uncertainties from different bins affect 
each other, because when the break position is clamped at a very 
low flux the uncertainties for the lowest flux bins reflect the fact 
that the counts in this noise-dominated region are independent of 
the brighter flux bins. 

3.1 Effects of clustering and confusion 

Next, in order to account for the effects of clustering and confu¬ 
sion, we injected all the noise-free SKADS sources into a map of 
noise, synthesized beam, size and pixel size equal to those of the 
VLA map of section 4.2. Having injected all the noise-free SKADS 
sources into the map, flux extraction proceeds as in section 4.3, for 
catalogue positions corresponding to those of injected sources. 

For this experiment, we fitted model C (with two breaks) to 
the binned extracted fluxes. We used model C here because we 
wanted to allow sufficient freedom in the possibly now-different 
reconstruction and it was only marginally disfavoured over model 
B. We carried out the fit with the minimum source flux injected into 
the map set to be 0.01,0.1,0.5 and 1 /rjy. Figure 4 shows the recon¬ 
structions for these different scenarios. With all sources >10nJy 
injected into the map, the counts are biased high, showing that, if 
the SKADS simulations portrayed reality, confusion would cause 


the algorithm to break down when applied to these VLA data even 
at the 5 -<t flux limit. Confusion is still an issue (top right in Fig¬ 
ure 4) for S > 0.1/rJy, causing the counts to be artificially boosted, 
but drops away once all sources < 0.5/rJy have been excised from 
the map (bottom left). This is consistent with the level at which 
confusion noise is expected in the VLA data (see section 4.2). 

While potentially disconcerting, this exercise is useful in un¬ 
derstanding the limitations of our modelling scheme. If SKADS is 
a true representation of the 1.4-GHz sky, our source counts would 
be biased. If confusion were an issue, efficient deblending algo¬ 
rithms have been proposed (e.g. Kurczynski and Gawiser 2010; 
Safarzadeh et al. 2015) to mitigate the effects of source blend¬ 
ing. We note that this effect may lead to problems for any planned 
relatively-low-resolution surveys from SKA pathfinders. We return 
to the effects of confusion in section 5.3. 

4 DATA 

4.1 Infrared and Optical Data 

The VIDEO survey (Jarvis et al. 2013), aimed at understanding 
galaxy formation, evolution and clusters, is ongoing. The data pre¬ 
sented here cover « 1 square degree and are in Z, Y, J, H, and 
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Figure 3. For the SKADS catalogue data, source-count reconstruction for model C as a function of the flux (indicated by the vertical red dashed line) at which 
the lowest break is fixed: (a) 10, (b) 5, (c) 1 and (d) 0.5 //,ly. The underlying, noise-free realisation of the counts for the 1-deg 2 catalogue is indicated by the 
green dashed line. The real, noisy catalogue data are shown as a black dashed line. The black line and grey shaded region show the maximum a posteriori 
(MAP) and 68-per-cent confidence region of the distributions of models reconstructed from every sample from the posterior. The vertical blue lines are at lcr 
and 5cr. 


K s bands. The project makes use of Canada-France-Hawaii Tele¬ 
scope Legacy Survey optical data (CFHTLS-D1; Ilbert et al. 2006) 
in the u*, g', r', i 1 , and z' bands. 

The data selection is similar to that employed by Zwart et al. 
(2014a). In order to select galaxies solely by stellar mass (see 
Zwart et al. 2014a and discussion therein), we considered just 
those with K s < 23.5, i.e. the 90-per-cent completeness limit for 
VIDEO’S formal 5-cr limit K s = 23.8 (see Jarvis et al. 2013). 
We further excised any objects contaminated by detector ghost¬ 
ing halos, as well as any objects that SEXTRACTOR deemed to 
be crowded, blended, saturated, truncated or otherwise corrupted. 
71,418 objects remained after this data selection, covering an area 
of 0.97 sq. deg. 

4.2 Radio Data 

Radio observations (Bondi et al. 2003) of the VIDEO field cover 
a 1 -square-degree field centred on J 2 h 26 m 00 s —4° 30' 00" (the 
XMM-LSS field). The data comprise a 1.4-GHz Very Large Array 
(VLA, B-array) 9-point mosaic. For any radio mosaicking strategy, 
the primary beam unavoidably imposes a variation in the map noise 
(see e.g. AMI Consortium et al. 2011a); for these data this variation 


is about 20 per cent around 17.5/rjy. The CLEAN restoring beam 
is 6" full width at half maximum and the map contains 2048 x 
2048 1.5-arcsec pixels. In Zwart et al. (2014a) we estimated the 
expected confusion noise for these 6-arcsec-beam data to be a* = 
0.8 fjtJy beam -1 in the Condon et al. (2012) definition, i.e. about 
a n /20 in our notation. 

4.3 Flux extraction 

It was pointed out by Stil et al. (2014) that a Nyquist-sampled syn¬ 
thesized beam could be insufficient for a stacking analysis if the 
positional uncertainty exceeds the width of a pixel. In the present 
case, the two options were to (i) employ aperture photometry or (ii) 
resample the map to a finer grid. Since our near-infra-red field is 
relatively crowded, the first would prove difficult without some de¬ 
blending scheme (e.g. Kurczynski and Gawiser 2010; Safarzadeh 
et al. 2015). Hence we selected the second method, upsampling 
the VLA map by a factor of 8 using interpolation. The scheme 
was tested on simulations and on the detected sources (i.e. those 
above 5a) and the bias in flux density was able to be reduced to 
< 1 per cent. 

A histogram of the extracted fluxes (up to 500 fijy) is shown in 
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Figure 4. Source-count reconstruction for model C as a function of the minimum flux injected into the map: (a) 0.01, (b) 0.1, (c) 0.5 and (d) 1 ply. The 
underlying, noise-free realisation of the counts for the 1-deg 2 catalogue is indicated by the green dashed line. The real, noisy catalogue data are shown as a 
black dashed line. The black line and grey shaded region show the maximum a posteriori (MAP) and 68-per-cent confidence interval of the distributions of 
models reconstructed from every sample from the posterior. The vertical blue lines are at lcr and 5a. 


Figure 5. Although gaussian on the negative side (implying dom¬ 
ination by thermal noise), there is a long positive tail caused by 
discrete radio sources; the tail contains the information that we will 
exploit to measure the source-count distribution for these sources. 


5 RESULTS 

With the algorithm and its implementation fully tested and proven, 
we set about extracting binned source counts at the positions of the 
VIDEO-detected K s -selected sources as described in section 4.3. 
We used 41 bins in the range —108 < S/p Jy < 85, with tighter 
binning near zero to reflect the increased numbers of sources in that 
flux region. 

The relative log c -evidences (for fixed <j„) are given in Ta¬ 
ble 3. The preferred model, i.e. the one with the highest evidence, 
is model D. The posterior probability distribution for that model is 
given in Figure 7. Figure 8 shows the reconstructions for the four 
models. A single power law (model A) has a slope that is consistent 
with that from Wilman et al. (2008) and Vernstrom et al. (2014), but 
the amplitude is relatively low and the fit at the bright end is poor. 
The evidence anyway strongly rejected this model. When a break is 


added (model B), the fit at the bright end becomes consistent with 
the > 5 <j counts from Bondi et al. (2003). 

The model C and D reconstructions are very similar, and our 
conclusions remain the same even though D is formally preferred. 
Taking model C (with two breaks), then, the counts remain flat to 
just below 20/rJy (~ la), before falling steeply. After the second 
break, at around 3ply, the counts become shallower and do not 
become consistent with those from Wilman et al. (2008) and Vern¬ 
strom et al. (2014) again until < 0.5/rJy. 


5.1 Sensitivity to thermal-noise contribution 

As a final check, we repeated the same fitting procedure, but al¬ 
lowing for an extra degree of freedom in that a n was now al¬ 
lowed to vary uniformly in the range 0.5-2.0 x cr surV ey,modal. 
This was because there was a hint (see for example Figure 5) that 
the assumed map noise was an underestimate. Table 4 shows the 
model-selection results. Immediately one sees that the relative evi¬ 
dences are overwhelmingly higher for this family of models when 
compared to the fixed-noise set. Model B' is now the preferred 
model, though only marginally compared to A', C ; and D', but 
it is nonetheless model B' that we formally adopt. Figure 9 gives 
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Figure 5. Histogram of the fluxes extracted from the VLA map at the posi¬ 
tions of the VIDEO sources. Overlaid with a red dashed curve is a gaussian 
of width equal to the modal survey noise of 16.2 // Jy. The vertical blue lines 
indicate the 1- and 5-cr thresholds. 


the corresponding reconstructions and Figure 10 is the posterior 
distribution for model B'. The intermediate break in the counts 
now disappears, indicating that the earlier models had probably 
been compensating for an incorrect noise figure, the model now 
being summarizable as flat to « 40/iJy, then falling with a slope of 
— 1.7 below that flux. The map noise estimated via the new route 
is 17.5 ± 0.08/rJy rather than the modal value of 16.2/Jy assumed 
earlier. 


5.2 Interpretation 

Heywood et al. (2013) showed that the scatter in the counts at 
100/My could be attributed to sample variance, but this becomes 
less true at fainter fluxes as the raw numbers of sources increase, 
so that it is unclear that this could scatter our counts. Assuming, 
then, no contribution from sample variance, we might interpret the 
very slight deficit of radio emission in that region to indicate that 
some small fraction of the Vemstrom et al. (2014) radio sources are 
beyond the faintness limit of the VIDEO data. Note that the dif¬ 
ference in the counts cannot be due to confusion, since we found 
earlier that confusion tended to bias the counts upwards. We see 
also that our inferred source counts are not consistent with those 
derived from the ARCADE2 data (Seiffert et al. 2011 and Fixsen 
et al. 2011), which afforded integral constraints on the 3-90-GHz 
temperature contribution of a putative population of non-Galactic 
discrete radio sources below some limiting flux, constraining the 
differential counts to be oc S~ 2 ' 6 , i.e. very flat in the Euclidean 
normalization. Our results thus give further weight to the argument 
of Vemstrom et al. (2014) (see their Figure 18) that the ARCADE2 
‘excess’ cannot be accounted for by a population of faint discrete 
sources. 


5.3 Could there be a confusing background? 

As a final test, we undertook map extractions at 72,000 random po¬ 
sitions for both the SKADS simulations and the VIDEO-VLA data. 
This should give an indication of the confusion ‘background’ away 
from the stacked positions. No > 5<r sources were injected into the 
SKADS map. In the case of the VIDEO data, we masked out the 
known (i.e. > 5a) sources from the Bondi et al. (2003) catalogue 


Table 3. Evidences A Z for models A-D applied to the VIDEO data, rela¬ 
tive to model A. It appears that the preferred model at this stage is D, but 
see Table 4. 


Model 

A log e 2 

Odds ratio Z / Za 

A 

0.0 

i 

B 

7.44±0.22 

10 3 ■ 2 :1 

C 

21.24±0.24 

10 9 2 :1 

D 

27.65±0.24 

10 12 °:1 

Table 4. Evidences A Z for models A' 

-D' applied to the VIDEO data, rel- 

ative to model A. The preferred model 

is now (just) B ; . 

Model 

A log e 2 

Odds ratio Z / Za 

A 

0.0 

i 

A' 

152.51±0.21 

10 66 23 :1 

B' 

152.86±0.21 

10 66 39 :1 

C' 

152.70±0.21 

10 66 32 :1 

D' 

152.48±0.21 

10 66 22 :1 

using circular apertures of radius 5x the full width at half maxi¬ 
mum of the synthesized beam, random pixels being drawn without 
replacement from the remaining area (subsequently corrected for). 

The reconstructions for the preferred models (from a choice 

of A, B, C and D for SKADS, 

and from all eight models for 


VIDEO) fitted to the resulting flux histograms are shown in Fig¬ 
ure 6. The comparison is revealing. For the SKADS extraction, 
the inferred counts are flat in euclidean space, with a very slight 
ramp at the bright end, with small uncertainties caused by a high 
signal-to-noise measurement of the average background flux. The 
VIDEO counts have a ramp at 1-5(7 due to confusion from un¬ 
masked sources in that regime (consider, for example, flux boosting 
due to an unmasked 4.5(7 source). However, the overall amplitude 
is considerably suppressed relative to the signal from the true posi¬ 
tions (note the logarithmic scale), even including the ramp. More¬ 
over, the amplitude is suppressed by a factor of «20 compared to 
the SKADS prediction. We conclude that our results are not con¬ 
taminated by confusion at any notable level. 


5.4 Subdivision by galaxy type 

We subdivided the galaxy sample using the criteria of Zwart et al. 
(2014a), which are based on the best-fitting spectral-energy distri¬ 
bution (to the 10-band photometry), which Zwart et al. argue to 
be a better discriminator than a colour-colour diagram. The three 
subsample classifications are (i) ellipticals, (ii) normal (Sbc, Scd, 
low-mass irregular) galaxies and (iii) starburst galaxies. We fitted 
models A'-D' to the three subsamples from this scheme in order 
to investigate the relative contributions to the source counts of the 
different populations. The results are summarized in Table 5 and 
Figure 11. The counts are dominated by the more numerous ‘nor¬ 
mal’ galaxies, whose counts also exhibit the (full sample’s) flat 
behaviour down to ~ 20/My. The counts of the ellipticals do not 
flatten at the bright end, but fall at a shallower rate, one that is con¬ 
sistent with the very bright (> 1 mJy) slope, as expected for such 


2 These are available from 

http: //web . oapd. inaf . it/rstools/srccnt/srccnt-tables . html 
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Figure 6. Source-count reconstructions for the preferred models for SKADS (left, model B) and VIDEO (right, model D') when 72,000 random map positions 
are used for the stacking. The underlying, noise-free realisation of the counts for the 1-deg 2 catalogue is indicated by the green dashed line. The real, noisy 
catalogue data are shown as a black dashed line. The black line and grey shaded region show the maximum a posteriori (MAP) and 68-per-cent confidence 
interval of the distributions of models reconstructed from every sample from the posterior. The vertical blue lines are at lcr and 5a. The underlying full 
SKADS (Wilman et al. 2008) counts are shown in blue, and the only other counts at these depths (Vemstrom et al. 2014) are marked in red, with the 
68 per cent confidence interval in dashed red. The filled and empty blue stars respectively represent the Bondi et al. (2003) corrected and uncorrected counts 
for the VLA > 5 a sources. Literature values 2 are taken from the review by de Zotti et al. (2010). 


Table 5. Summary of analysis for the VIDEO galaxy-type subsamples. 


Type 

Galaxies 

Flux range / ^xJy 

Bins 

Preferred model 

All 

71,418 

-108-85 

41 

B' 

Elliptical 

5351 

-67-85 

38 

B' 

Normal 

54,879 

-108-85 

41 

C' 

Starburst 

11,187 

-69-85 

38 

B' 


AGN hosts. The starburst counts are not as flat as the overall trend 
as far as 20pdy, but do, like the other subsamples, steepen below 
that flux. Hence, as expected, the source counts are dominated by 
normal ‘spiral galaxies’ at these fluxes, with starburst galaxies and 
AGN making little contribution (our selection excludes QSOs). 


6 CONCLUSIONS 

(i) We have cast ‘sub-threshold stacking’ in a fully bayesian 
framework for the first time, including the ability to calculate the 
evidence for the purposes of model selection. 

(ii) As well as the bayesian evidence, our framework reveals 
the exploration of full posterior probability distributions, show¬ 
ing explicitly any degeneracies and/or correlations. We note that 
marginalized parameter estimates are incorrect if such degenera¬ 
cies are present as in these cases. 

(iii) We applied the algorithm to the SKADS simulations. When 
run on a SKADS catalogue, we were able to reconstruct the counts 
successfully down to sub-/ijy levels, i.e. the 5a n /\/N = 375000 
level. This also holds true when applied to data (below). 

(iv) We used the SKADS catalogue to simulate a VIDEO-VLA- 
like radio map from which we extracted fluxes at the positions of 
SKADS sources on which to run the algorithm. We showed that 
confusion biased the counts high unless we artifically assumed no 
radio emission below 1 pjy. This needs to be accounted for in anal¬ 


ysis of data from relatively-low-resolution SKA pathfinders such as 
ASKAP. 

(v) We applied the algorithm to VLA data stacked at VIDEO 
positions. A power-law model (D) with three breaks (the two at 
3 piy and 20 ply being the predominant ones) is preferred when 
the modal map noise of 16.2 ply is assumed. 

(vi) When the map noise is varied as a free parameter, a power- 
law model (B') with a single break is overwhelmingly preferred, 
and we adopt model B' (over the fixed-noise model D) as our final 
result. The inferred counts can be summarized as flat to « IQply, 
then falling with a slope of —1.7 below that flux. The map noise 
estimated via is this route is, rather, 17.5 ± 0.08/rJy. 

(vii) While one would not expect them to match a priori, we 
interpret the slight deficit of counts below 20 ply relative to the 
results of Wilman et al. (2008) and Vemstrom et al. (2014) as in¬ 
dicative of a fraction of the radio emission coming from galaxies 
fainter than the flux limit of the VIDEO catalogue, be they lower- 
redshift faint galaxies with some ongoing star formation, or higher- 
redshift moderately bright galaxies. Like those works, our results 
are not consistent with the ARCADE2 results or indeed those from 
the work of Owen and Morrison (2008). 

(viii) We have subdivided the VIDEO sample into ellipticals, 
normal and starburst galaxies, and fitted source count models to the 
binned flux distributions, finding that the counts are dominated by 
normal ‘spiral galaxies’ at these fluxes, with little contribution from 
starburst galaxies or AGN. 

(ix) We note the usefulness and wide applicability of our algo¬ 
rithm, and look forward to its employment in future radio surveys 
such as those with MeerKAT (Jarvis 2012), LOFAR (van Haarlem 
et al. 2013), ASKAP (Norris et al. 2011) and SKA (Norris et al. 
2013; Prandoni and Seymour 2014). 


6.1 Future work 

The radio luminosity, solely a function of source flux and redshift, 
can be used to infer star-formation rates, and, using the available 
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Figure 7. Posterior probability distribution for the three-break model D fitted to the VIDEO data. The dark and light shaded regions indicate the 68 and 
95 per cent confidence limits. 


stellar-mass estimates, specific star-formation rates (cf. Zwart et al. 
2014a). It will be straightforward to extend our algorithm to the 
derivation of luminosity functions (Malefahlo et al. in prep.), and 
their evolution with redshift, building on our work and that of Rose- 
boom and Best (2014). It will be interesting to compare estimates 
for these quantities with those derived from traditional stacking 
methods. As hinted at earlier, one would like to undertake a joint 
analysis of the counts of confusing sources below the thermal-noise 
limit together with those measured in this work, and our framework 
permits this too. Finally, the algorithm can readily be applied to any 
survey data map that has an auxiliary catalogue of greater depth 
and resolution, e.g. VLA-COSMOS. 10C (Whittam et al. in prep.), 
BLAST or Herschel- ATLAS, though in some low-resolution cases 
modification may be needed in order to account for a P(D )-style 
confusion contribution (high likelihood of beam occupancy >1). 
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Figure 11. Reconstructions for galaxies in the VIDEO sample, by galaxy subsample (all, elliptical, normal and starburst). The underlying full SKADS (Wilman 
et al. 2008) counts are shown in blue, and the only other counts at these depths (Vemstrom et al. 2014) are marked in red, with the 68 per cent confidence 
interval in dashed red. The filled and empty blue stars respectively represent the Bondi et al. (2003) corrected and uncorrected counts for the VLA > 50- 
sources. Literature values are taken from the review by de Zotti et al. (2010). The real, noisy data are shown as a black dashed line. The black line and grey 
shaded region show the maximum a posteriori (MAP) and 68-per-cent confidence interval of the distributions of models reconstructed from every sample 
from the posterior. The vertical blue lines are at lo- and 5o\ 
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